Comparison of TDLeaf(λ) and TD(λ) Learning in Game Playing Domain

Authors

  • Daniel Osman
  • Jacek Mańdziuk
Abstract

In this paper we compare the results of applying the TD(λ) and TDLeaf(λ) algorithms to the game of give-away checkers. Experiments show comparable overall performance for both algorithms, although TDLeaf(λ) seems to be less vulnerable to weight over-fitting. Additional experiments were performed to test three learning strategies used in self-play. The best performance was achieved when the weights were modified only after non-positive game outcomes, and when the training procedure was focused on stronger opponents. TD-learning results are also compared with a pseudo-evolutionary training
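As an illustration of the kind of weight update the abstract refers to, here is a minimal sketch of a TD(λ) update applied once per game. It is not the authors' implementation: the tanh-of-linear evaluation, the function names, the parameter values, and the `skip_after_win` flag are all assumptions made for this example. The same routine could stand in for TDLeaf(λ) if the feature vectors passed in belong to the principal-variation leaves returned by a search rather than to the positions actually reached in the game.

```python
import math


def evaluate(weights, features):
    """Hypothetical evaluation function: tanh of a weighted feature sum."""
    return math.tanh(sum(w * f for w, f in zip(weights, features)))


def td_lambda_update(weights, positions, outcome, alpha=0.1, lam=0.7,
                     skip_after_win=False):
    """Return updated weights after one game (illustrative sketch only).

    positions: one feature vector per position the learner evaluated.
        For TD(lambda) these describe the positions reached in the game;
        for TDLeaf(lambda) they would describe the principal-variation
        leaves found by search from those positions.
    outcome: final result from the learner's side (+1 win, 0 draw, -1 loss).
    skip_after_win: assumed flag that mimics updating the weights only
        after non-positive outcomes (draws and losses).
    """
    if skip_after_win and outcome > 0:
        return list(weights)

    values = [evaluate(weights, p) for p in positions]
    # Each position's target is the next position's value; the last
    # position's target is the actual game outcome.
    targets = values[1:] + [outcome]
    diffs = [target - value for target, value in zip(targets, values)]

    new_weights = list(weights)
    discounted = 0.0
    # Walk backwards so that `discounted` holds the lambda-weighted sum
    # of all temporal differences from position t onwards.
    for t in reversed(range(len(values))):
        discounted = diffs[t] + lam * discounted
        grad_scale = 1.0 - values[t] ** 2  # derivative of tanh
        for i, feature in enumerate(positions[t]):
            new_weights[i] += alpha * grad_scale * feature * discounted
    return new_weights
```

With `skip_after_win=True` a won game leaves the weights untouched, which is one simple way to express the "only after non-positive outcomes" strategy mentioned in the abstract.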

Similar resources

TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search

In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with minimax search. We present some experiments in both chess and backgammon which demonstrate its utility and provide comparisons with TD(λ) and another less radical variant, TD-directed(λ). In particular, our chess program, "KnightCap," used TDLeaf(λ) to learn its evaluation fun...

KnightCap: A Chess Program That Learns by Combining TD(lambda) with Game-Tree Search

In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our chess program “KnightCap” used TDLeaf(λ) to learn its evaluation function while playing on the Free Internet Chess Server (FICS, fics.onenet.net). The main success we report is that KnightCap improved from a 1650 rating ...

arXiv:cs.LG/9901002v1, 10 Jan 1999. KnightCap: A chess program that learns by combining TD(λ) with game-tree search

In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our chess program “KnightCap” used TDLeaf(λ) to learn its evaluation function while playing on the Free Internet Chess Server (FICS, fics.onenet.net). The main success we report is that KnightCap improved from a 1650 rating ...

The Application of TD(λ) Learning to the Opening Games of 19×19 Go

This paper describes the results of applying Temporal Difference (TD) learning with a network to the opening game problems in Go. The main difference from other research is that this experiment applied TD learning to the full-sized (19×19) game of Go instead of a simple version (e.g., 9×9 game). We discuss and compare TD(λ) learning for predicting an opening game’s winning and for finding the be...

TDLeaf(λ): Combining Temporal Difference Learning with Game-Tree Search

ABSTRACT In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with minimax search. We present some experiments in both chess and backgammon which demonstrate its utility and provide comparisons with TD(λ) and another less radical variant, TD-directed(λ). In particular, our chess program, “KnightCap,” used TDLeaf(λ) to learn its evaluati...

Publication date: 1994